# Copy the "input.txt" file to your Ubuntu home directory

# Now start the Spark shell, specifying the master and the number of cores to use in local mode (here, 4 cores as an example):
$ spark-shell --master "local[4]" 

Step-1: Load packages: Load the required FPGrowth package and the other dependent packages:

scala>import org.apache.spark.mllib.fpm.FPGrowth
scala>import org.apache.spark.{SparkConf, SparkContext}

Step-2: Read the transactions: Let's read the transactions as an RDD using the Spark context (sc) that the shell has created [see Figure 6]:

scala>val transactions = sc.textFile("input.txt").map(_.split(" ")).cache()

Step-3: Check the number of transactions:

scala>println(s"Number of transactions: ${transactions.count()}")
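
Note: each line of the input file is treated as one transaction, and each space-separated token as one item. Recent versions of MLlib's FPGrowth are expected to reject a transaction that contains the same item twice, so it can help to clean the lines before building the RDD. The following is a minimal sketch in plain Scala; the sample lines are hypothetical (they are not the contents of input.txt), and the trim, "\\s+" and distinct steps are additions to the split used in Step-2.

```scala
// Hypothetical sample lines, standing in for the contents of input.txt.
val sampleLines = Seq("a b  c a", " a d ", "b e b")

// trim removes leading/trailing blanks, "\\s+" tolerates repeated spaces,
// and distinct drops repeated items within one transaction.
val parsed = sampleLines.map(_.trim.split("\\s+").distinct)

parsed.foreach(t => println(t.mkString("[", ",", "]")))
// [a,b,c]
// [a,d]
// [b,e]
```

The same function could be passed to map in Step-2 in place of _.split(" ") if the input file is not already clean.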

Step-4: Create an FPGrowth model: Create the model by specifying the minimum support threshold and the number of partitions:

scala>val model = new FPGrowth().setMinSupport(0.2).setNumPartitions(2).run(transactions)
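
Note: a minimum support of 0.2 means that an itemset is reported only if it occurs in at least 20% of the transactions, that is, in at least ceil(0.2 * number of transactions) of them. The following plain-Scala sketch illustrates the threshold on a toy data set. The data is hypothetical, the sketch only counts single items, and it is not how FPGrowth works internally.

```scala
// Hypothetical toy transactions (not the contents of input.txt).
val toy = Seq(
  Array("a", "b", "c"),
  Array("a", "b"),
  Array("a", "d"),
  Array("b", "e"),
  Array("a", "b", "e"),
  Array("c")
)

val minSupport = 0.2
// Minimum number of transactions an itemset must appear in.
val minCount = math.ceil(minSupport * toy.size).toLong

// Count, for each item, the number of transactions that contain it.
val itemCounts = toy
  .flatMap(_.distinct.toSeq)
  .groupBy(identity)
  .map { case (item, occurrences) => item -> occurrences.size.toLong }

// Keep only the items that reach the threshold.
val frequentItems = itemCounts.filter(_._2 >= minCount).keys.toSeq.sorted

println(s"minCount = $minCount, frequent single items = ${frequentItems.mkString(",")}")
```

Raising the threshold reduces the number of reported itemsets; lowering it increases both the output size and the running time.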

Step-5: Check the number of frequent patterns (itemsets):

scala>println(s"Number of frequent itemsets: ${model.freqItemsets.count()}")

Step-6: Print patterns and support: Print the frequent patterns and their corresponding support (frequency) counts (see Figure 10). The Spark job runs on the local host (see Figure 11).

scala>model.freqItemsets.collect().foreach { itemset => println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)}

# The output appears on the terminal as shown in Figure 10.
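
Note: inside spark-shell the context sc already exists, so the SparkConf and SparkContext imports from Step-1 are only needed when the same steps run as a standalone application. The following is a sketch of such an application, under these assumptions: the object name and application name are hypothetical, input.txt is in the working directory, Spark MLlib is on the classpath, and sorting the itemsets by frequency is an addition to Step-6.

```scala
import org.apache.spark.mllib.fpm.FPGrowth
import org.apache.spark.{SparkConf, SparkContext}

// Hypothetical object name; the body mirrors Steps 2 to 6.
object FPGrowthExample {
  def main(args: Array[String]): Unit = {
    // local[4] mirrors the spark-shell invocation; the app name is a hypothetical label.
    val conf = new SparkConf().setAppName("FPGrowthExample").setMaster("local[4]")
    val sc = new SparkContext(conf)
    try {
      // One transaction per line, items separated by single spaces.
      val transactions = sc.textFile("input.txt").map(_.split(" ")).cache()
      println(s"Number of transactions: ${transactions.count()}")

      val model = new FPGrowth()
        .setMinSupport(0.2)
        .setNumPartitions(2)
        .run(transactions)
      println(s"Number of frequent itemsets: ${model.freqItemsets.count()}")

      // Collect to the driver and print the most frequent itemsets first.
      model.freqItemsets.collect().sortBy(-_.freq).foreach { itemset =>
        println(itemset.items.mkString("[", ",", "]") + ", " + itemset.freq)
      }
    } finally {
      sc.stop() // release the local Spark resources
    }
  }
}
```

Collecting all itemsets to the driver is fine for a small file such as this one; for a large result it may be safer to use take(n) or to save the RDD instead.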
 
